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Abstract 

Ideally, the time that an incremental algorithm uses to 
process a change should be a function of the size of the 
change rather than, say, the size of the entire current 
input. Based on a formalization of "the set of things 
changed" by an incremental modification, this paper 
investigates how and to what extent it is possible to 
give such a guarantee for a chart-based parsing frame- 
work and discusses the general utility of a minimality 
notion in incremental processing. 1 



1 Introduction 

1.1 Background 

Natural-language computing has traditionally been un- 
derstood as a "batch-mode" or "once-only" process, in 
which a problem instance P (say, a text) is mapped 
as a whole to a solution S (such as an analysis of the 
text). However, in highly interactive and real-time ap- 
plications — for example, grammar checking, structure 
editing and on-line translation — what is required is ef- 
ficient processing of a sequence of small changes of a 
text. Exhaustive recomputation is then not a feasible 
alternative. Rather, to avoid as much recomputation 
as possible, each update cycle must re-use those parts 
of the previous solution that are still valid. We say that 
an algorithm is incremental if it uses information from 
an old solution in computing the new solution. 

The problem of incremental processing can be stated 
as follows, using a notation similar to that of Alpern et 
al. [1]: Assume given a problem instance P (a represen- 
tation of the current input), a solution S (the current 
output), and a modification Ap to P. 2 The modifica- 
tion results in a new problem instance P' = P © Ap, 
where © is a composition operator. The task of an in- 



1 I would like to thank Ralph Ronnquist as well as Gregor 
Erbach and other colleagues in Saarbriicken for discussions on 
the material presented here, Peter Fritzson for originally alerting 
my attention to Ramalingam and Reps' paper, and the anony- 
mous referees. This research has been funded by the German 
Science Foundation (DFG) through the Sonderforschungsbereich 
314, project N3 (BiLD). 

2 A terminological note: we use "input change" and "modifi- 
cation" as well as "output change" and "update" synonymously. 
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Figure 1: Batch-mode versus incremental computation. 



cremental algorithm is then to produce an update As 
in the old solution such that S © As is a solution to 
P © Ap (see figure 1). At this point, nothing is stipu- 
lated about the amount of information in S that should 
be re-used in 5". 

To show properties such as correctness and complex- 
ity of incremental algorithms, it is necessary to estab- 
lish a formal measure of "the set of things changed" . 
This measure should capture the minimal change re- 
sulting from a modification and, moreover, should be 
independent of any particular algorithms for incremen- 
tal update. One way of achieving this is to compare 
the results obtained by batch-mode processing of the 
inputs before and after the change, respectively (Wiren 
and Ronnquist [15, 17]): By forming the "difference" 
between the batch-mode solutions S and 5" obtained 
before and after a modification Ap to P, we obtain a 
parameter As mm which captures the minimal change 
in a way which is indeed independent of the incremen- 
tal update. Given that As corresponds precisely to 
what any sound and complete incremental algorithm 
must do, it can be used as a basis for correctness proofs 
for such algorithms (given that the batch-mode algo- 
rithm is correct). 

Furthermore, A? can be used as a basis of com- 
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plexity analyses: Ideally, each update cycle of an in- 
cremental algorithm should expend an amount of work 
which is a polynomial function of the size of the change, 
rather than, say, the size of the entire current input. 
However, making this notion precise in a way which is 
independent of particular incremental algorithms is not 
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always straightforward. Two early approaches along 
these lines are Goodwin [3, 4] (reason maintenance) 
and Reps [11] (language-based editing). More recently, 
Alpern et al. [1] and Ramalingam and Reps [9, 10] have 
provided a framework for analysing incremental algo- 
rithms, in which the basic measure used is the sum of 
the sizes of the changes in the input and output. This 
framework assumes that the modification of the input 
can be carried out in 0(|Ap|) time, where the generic 
notation \X\ is used for the size of X. Furthermore, it 
assumes that IAq I denotes the minimal IAqI such 

I > J min 1 l^l 

that S © As solves P © Ap. Alpern et al. then define 

*=|Ap| + |A SfBi J 

as the intrinsic size of a change. 

The choice of 8 is motivated as follows: |Ap|, the size 
of the modification, is in itself too crude a measure, 
since a small change in problem instance may cause 
a large change in solution or vice versa. IAq I is 
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then chosen as a measure of the size of the change in 
the solution, since the time for updating the solution 
can be no less than this. The 8 measure thus makes 
it possible to capture how well a particular algorithm 
performs relative to the amount of work that must be 
performed in response to a change. 

An incremental algorithm is said to be bounded if 
it can process any change in time 0(f(8)), that is, in 
time depending only on 8. Intuitively, this means that 
it only processes the "region" where the input or output 
changes. Algorithms of this kind can then be classified 
according to their respective degrees of boundedness 
(see Ramalingam and Reps [10, section 5]). For exam- 
ple, an algorithm which is linear in 8 is asymptotically 
optimal. Furthermore, an incremental algorithm is said 
to be unbounded if the time it takes to update the so- 
lution can be arbitrarily large for a given 8. 

It might seem that what has been discussed so far has 
little relevance to natural-language processing, where 
incrementality is typically understood as the piecemeal 
assembly of an analysis during a single left-to-right 3 
pass through a text or a spoken utterance. In par- 
ticular, incrementality is often used as a synonym for 
interleaved approaches, in which syntax and seman- 
tics work in parallel such that each word or phrase is 
given an interpretation immediately upon being recog- 
nized (see, for example, Mellish [7] and Haddock [5]). 
However, the two views are closely related: The "left- 
to-right view" is an idealized, psycholinguistically mo- 
tivated special case, in which the only kind of change 
allowed is addition of new material at the end of the 
current input, resulting in piecemeal expansion of the 
analysis. Moreover, the interleaving is just a conse- 
quence of the fact that every piece of new input must, 
in some sense, be fully analysed in order to be inte- 
grated with the old analysis. 

To distinguish this special case from the general case, 
in which arbitrary changes are allowed, Wiren [15] 

3 Strictly speaking front-to-back or beginning-to-end. 



refers to them as left-to-right (LR) incrementality and 
full incrementality, respectively. The former case cor- 
responds to on-line analysis — that each prefix of a 
string is parsed (interpreted) before any of the input 
beyond that prefix is read (Harrison [6, page 433]). 
The latter case has long been studied in interactive 
language-based programming environments (for exam- 
ple, Ghezzi and Mandrioli [2]), whereas the only pre- 
vious such work that we are aware of in the context of 
natural-language processing is Wiren and Ronnquist 
[14, 15, 16, 17]. 

1.2 The Problem 

The aim of this paper is to begin to adapt and ap- 
ply the notion of bounded incremental computation 
to natural-language parsing, using a method for estab- 
lishing minimal change previously introduced by Wiren 
and Ronnquist [15, 17]. To this end, the paper shows 
how the 8 parameter can be defined in a fully incremen- 
tal, chart-based parsing framework, briefly describes a 
previous, unbounded algorithm, and then shows how a 
polynomially bounded algorithm can be obtained. 

2 Batch-Mode Chart Parsing 

An incremental problem can be defined by specifying 
its batch-mode version and the set of allowable modifi- 
cations. We thus begin by specifying batch-mode chart 
parsing, restricting ourselves to a standard context-free 
grammar without cyclic or empty productions. 

Definition 1 (Chart) A chart is a directed graph 
C = (V, E) such that V is a finite, non-empty set of 
vertices and E C V x V x R is a finite set of edges, 
where R is the set of dotted context-free rules obtained 
from the grammar. 4 

The vertices v\, v n+ \ £ V correspond to the lin- 
ear positions between the tokens r = t\ ■ ■ ■ t n of an 
n-token text. 5 An edge e££ between vertices i> 8 - and 
Vj carries information about a (partially) analysed con- 
stituent between the corresponding positions. 

The algorithm makes use of an agenda (see Thomp- 
son [12]). Agenda tasks are created in response to to- 
kens being read and edges being added to the chart, 
and may be ordered according to their priorities. To 
define the agenda, we make use of the set of possible 
tokens Tkns and the set of possible edges Edgs. 

Definition 2 (Agenda) We define the agenda as 
Agda C Tkns U Edgs U (Edgs x Edgs). We refer to 
the three types of tasks that it contains as scanning, 
prediction and combination tasks, respectively. 

4 For brevity, we omit a fourth edge component correspond- 
ing to the set of (partial) parse trees according to the grammar 
and lexicon (assuming that only the topmost portion of a tree 
corresponding to the dotted rule needs to be stored in an edge). 

5 We shall use r interchangeably to denote a sequence and a 
set of tokens. 



Each agenda task is executed by a step of the algo- 
rithm below. We specify two versions of batch-mode 
chart parsing — the basic bottom-up (strictly speak- 
ing, left-corner) and top-down (Earley-style) strategies 
— assuming that the one or the other is chosen. 

Algorithm 1 (Batch-mode chart parsing) 

Input: A sequence of tokens r = t\ ■ ■ ■ t n . 
Output: A chart. 

Initialization: If the top-down strategy is used, then 
add an agenda task corresponding to an initial top- 
down prediction (vi, vi, S ^ .a) for each rule S —kx, 
where S is the start category of the grammar. 

Method: For each token, create a scanning task. 
While the agenda is not empty, remove the next task 
and execute the corresponding step below: 

Scan: Given a token t at position j , for each lexical en- 
try of the form X —>■ t, add an edge (vj , i>j+i, X — ► 
t.). 6 Add resulting new tasks to the agenda. 

Predict 1 (Bottom-up): If the edge is of the form 
(vj,Vk,X —> a.}, then, for each rule of the form 

Y — ► Xj, add an edge (vj,Vj,Y — ► *Xy) unless 
it already exists. Add resulting new tasks to the 
agenda. 

Predict 2 (Top-down): If the edge is of the form 
(vi,Vj , X —kx*Y /3), then, for each rule of the form 

Y — ► 7, add an edge (vj,Vj,Y — ► .7) unless it 
already exists. Add resulting new tasks to the 
agenda. 

Combine: If the first edge is of the form (vi , vj , X — ► 
a.Y/3) and the second is of the form (vj,Vk,Y —> 
7.), then add an edge (vi,Vk,X —> aY.fi). Add 
resulting new tasks to the agenda. 

3 Incremental Chart Parsing 
3.1 The Problem 

The overall incremental process can be thought of as a 
change-update loop, where each change of the input is 
immediately followed by a corresponding update of the 
output. To completely specify the state of this process, 
we shall make use of a configuration consisting of (a 
representation of) an input text r, a chart C and an 
edge-dependency relation V (to be defined in section 4). 
The problem of incremental chart parsing can then be 
specified abstractly as a mapping 

/«r, C,V),A t )^{t',C',V) 

from an old configuration and a modification A r to a 
new configuration. We shall allow two kinds of change, 
namely, insertion and deletion of m > 1 contiguous 
tokens. We assume that a modification A r is given as 

6 We refer to the new edge as a lexical edge. 



a vertex pair v j , Vj +m £ V defining the update interval 
and, in the case of an insertion, a sequence of tokens 
r = tj ■ ■ ■ tj +m . We furthermore assume that either the 
bottom-up or top-down strategy is chosen throughout 
a change-update session, and, in the latter case, that 
the top-down initialization is made before the session 
is started. 

3.2 A General Vertex Mapping 

How can the minimal change Ac be defined in a 
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chart-based framework? One way of doing this is to 
compare the charts C = (V, E) and C" = (V, E') that 
are obtained by batch-mode parsing of the texts before 
and after a change, respectively. We thereby obtain a 
measure which is independent of particular incremental 
update algorithms. Intuitively, only those edges that 
are in E but not in E' must be removed, and only those 
edges that are in E' but not in E must be generated 
anew. If the change is small, then a large fraction of 
the edges are in E fl E' (that is, are unchanged). 

However, to be able to compare the edge sets in the 
two charts, we must first establish a one-to-one map- 
ping between their vertices. Let us consider the case in 
which a single token t{ is deleted from an n-token text. 
The problem is that, because of the removed token, the 
two vertices and i> 8 +i would seem to correspond to a 
single vertex in V' . However, we can regard this single 
vertex as consisting of a "left half" and a "right half" , 
which we assign different indices. In other words, af- 
ter having increased each index of v' i+1 , . . . , v' n £ V' by 
one, we "split" vertex v\ and assign the index i + 1 to 
its "right half" . The incoming non-predicted edges as 
well as (looping) top-down predictions at the split ver- 
tex are then associated with its left half, and the outgo- 
ing non-predicted edges as well as (looping) bottom-up 
predictions are associated with its right half. 7 The rea- 
son for dividing the predicted edges in this way is that 
a top-down prediction is made at the ending vertex of 
the triggering edge (that is, from the left), whereas a 
bottom-up prediction is made at the starting vertex of 
the triggering edge (that is, from the right). 

The mapping can be generalized to the case in which 
m contiguous tokens are deleted. This is done by in- 
creasing the index of each vertex from the "right half" 
of the split vertex and onwards by m (instead of one). 
Furthermore, by using the same mapping but in the 
opposite direction, we can also cover insertion of m 
contiguous tokens. To express this generalized map- 
ping, assume that V is the set of vertices of the larger 
chart and V is that of the smaller chart. A deletion of 
m contiguous tokens then involves a mapping from V 
to V and an insertion of m tokens involves a mapping 
from V to V . In terms of the indexing that holds before 
the vertices in V are renumbered, and assuming that V 
has n + l vertices, we obtain the following bidirectional 

7 As mentioned above, we assume that only the one or the 
other strategy is used, so that it is known beforehand which 
kind of predictions the chart contains. 




Figure 2: Chart of the sentence "The old man the tall 
ships" under bottom-up parsing. Inactive edges are 
drawn using continuous lines, active edges using dashed 
lines, and predicted (looping) edges are depicted below 
the vertices. 
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Figure 3: Chart of the sentence "The old man the 
ships" under bottom-up parsing. 



mapping: 

• Vertices v\, . . . , £ V correspond to 

Vi-i £ V , respectively. 

• Vertex Vi corresponds to the "left half" of vertex 

Vi. 

• Vertices . . . , 5;+ m _i £ V do not correspond 
to any vertices in V . 

• Vertex «;+ m corresponds to the "right half" of 
vertex V{. 

• Vertices i>i +m+ i , . . . , 5 n +i correspond to 
v i+ i, . . .,w„ + i_ m , respectively. 

The mapping is thus established with respect to inser- 
tion or deletion of an arbitrary number of contiguous 
tokens. 8 



3.3 Minimal Change 

Assume that E and E' are the sets of edges of the 
charts C and C" obtained by batch-mode parsing of a 
text before and after a modification A r , respectively. 
We can then define the minimal output change on the 
basis of two edge sets as follows: 



8 Presumably, it is possible to generalize the mapping to more 
complex (non-contiguous) operations such as replacements or re- 
versals. However, we do not pursue that here. 
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Figure 4: Example grammar and lexicon. 



Definition 3 (Minimal output change) We define 
the set of missing edges as the set difference M = E\E' 
and the set of new edges as the set difference N = 
E' \ E. We then define the minimal output change as 
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Next, we can define the size of the minimal change as 
follows: 

Definition 4 (Size of minimal change) We define 
the size of the minimal change as 6 = |A r | + \Ac |, 
the sum of the number of inserted or deleted tokens 
and the number of edges in Ac 
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3.4 An Example 

As an illustration, the chart in figure 2 is obtained un- 
der (batch-mode) bottom-up parsing, given the gram- 
mar in figure 4 and the sentence "The old man the 
tall ships". If the token "tall" is removed, the chart 
in figure 3 is obtained. Vertex v$ in figure 2 then cor- 
responds to the left half of vertex v' 5 in figure 3, and 
vertex vg corresponds to the right half of vertex v' 5 . 
Furthermore, vj corresponds to v' 6 . Clearly, the input 
change A r consists of the token "tall" . The output 
change Ac mm consists of the missing set M , which 
contains the three edges A26, NP27 and NP34 in fig- 
ure 2, and the new set N , which contains the single 
edge NP32 in figure 3. The size of the change is then 
8 = \A T \ + \A C ■ 1 = 1 + 3 + 1 = 5. 
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If instead "tall" is inserted before the last word in the 
sentence in figure 3, then the input change still consists 
of the token "tall". However, the two sets making up 
the output change are reversed: the missing set con- 
tains the single edge NP32 in figure 3 and the new set 
contains the three edges A26, NP27 and NP34 in fig- 
ure 2. Thus, the size of the change is again 5. 

4 An Unbounded Algorithm 

A key idea of the incremental chart-parsing algorithm 
put forward by Wiren [14, 15] is to use edge depen- 
dencies for keeping track of edges that have to be re- 
moved in response to a change. An edge e' is said to 
depend upon another edge or token e if it is formed 
(derived) directly on the basis of e. Furthermore, if e' 
is redundantly proposed by an edge /, then e' can be 
said to depend (also) on /. By e' being "redundantly 
proposed" , we mean that the parser attempts to add 




Figure 5: Edge-dependency graph induced by V . The 
nodes of the graph correspond to the chart edges in 
figure 3. A dummy root node is shown instead of 
nodes corresponding to the tokens. 

an edge that is equivalent to e' to the chart, but that 
that edge is rejected by the standard redundancy test 
in chart parsing. In effect, / provides an additional 
"justification" for e'. 

Given a chart C = (V, E) and a set of tokens r, these 
conditions correspond to the following dependency re- 
lation on E and r: 

Definition 5 (Edge dependency) We define V as a 
binary relation on the set of chart edges and the set 
of tokens E U t such that V(s, d) holds if and only if 
d G E is formed, or is redundantly proposed, directly 
using s £ J?Ut according to a chart-parsing algorithm. 
We say that d is a dependent (or derivative) edge of s, 
and that s is a source edge (token) of d. 

V can be illustrated by a graph. The dependency graph 
corresponding to the chart in figure 3 is shown in fig- 
ure 5. 

On the basis of the dependency relation, Wiren and 
Ronnquist [15, 17] define different disturbance sets, 
given as functions from tokens to sets of edges, and con- 
taining edges that need to be removed from the chart 
in response to a token-level change. The simplest such 
set is V*(tj), the transitive closure of V(tj). Wiren 
and Ronnquist [15, 17] discuss this and other alterna- 
tives and show completeness of V* with respect to the 
missing set. 

The algorithm performs an update essentially by re- 
moving the entire disturbance set and then generat- 
ing all possible edges. The latter set includes not only 
the new edges, but also disturbed, non-missing edges, 
which have to be generated anew. The complexity anal- 
ysis of the algorithm yields that it is unbounded incre- 
mental in both its bottom-up and top-down version (see 
Wiren [16]). The source of this is that the algorithm 



removes the entire disturbance set, whose size depends 
on n, the size of the entire input. 

5 A Bounded Algorithm 

5.1 Intuitive Idea 

Intuitively, a bounded incremental algorithm only pro- 
cesses the region where the input or output changes 
during an update cycle. In our case, the problem in 
achieving this is that the missing and new edges are 
not a priori known — when the incremental update 
begins, only a set of potentially missing edges (the dis- 
turbance set) is known. However, the update can be 
limited by using a change-propagation algorithm (com- 
pare Ramalingam and Reps [10, page 21]): By initially 
retaining the disturbance set, new and old edges can be 
compared during reparsing. If a new edge e' is different 
from the corresponding old edge e (if this exists), then 
the dependants of e are regarded as disturbed (poten- 
tially missing). If e' is equivalent to e in the sense of 
giving rise to the same derivative edges, then the de- 
pendants of e are known not to be missing, and hence 
the reparsing process does not have to proceed beyond 
this point in the search space. In order to avoid extra 
computation, the disturbed edges should be visited in 
the order given by the dependency graph. 

How can the points at which a change "dies out" be 
characterized? Since we are interested in characteriz- 
ing the conditions under which two edges give rise to 
the same derivative edges, the contents part of an edge 
(that is, the right-hand side before the dot of the dotted 
rule) is irrelevant. For example, we want to say that 
the new edge NP32 in figure 3 to be reparsing-equiva- 
lent with edge NP34 in figure 2 although their dotted 
rules and parse trees are different: the dotted rule of 
the former is NP — ► Det N . and that of the latter 
is NP — ► Det AN.. We can summarize this in the 
following definition: 

Definition 6 (Reparsing-equivalent edges) 

Assume given a proposed edge e and a disturbed edge 
e' G C. We say that e = (vi,vj,X —kx*/3) and e' = 
(v s , Vf, Y —t/i.i/) are equivalent from the point of view 
of reparsing if i = s, j = t, X = Y and [3 = v. 

Inactive (combined or lexical) edges and predicted 
edges are special cases under this definition. In the 
former case, [3 and v are empty, and thus two inac- 
tive edges are reparsing-equivalent if i = s, j = t and 
X = Y . In the latter case, a and /j, are empty, and thus 
two predicted edges e and e' are reparsing-equivalent if 
e = e'. 

5.2 The Algorithm 

We now specify a bounded incremental chart-parsing 
algorithm that handles one update cycle. 9 In compari- 

9 The algorithm is currently being implemented. 



son with the unbounded algorithm, the differences are 
in the reparse and remove steps. 

Algorithm 2 (Incremental Chart Parsing) 

Input: A configuration (t,C,T>) and a modification 
A r corresponding to insertion or deletion of m tokens 

ti i • • • i ti-\-m • 

Output: An updated configuration (r' , C" ,V) . 
Method: Do the following steps: 

Modify the problem instance: 

Insert or delete the modified tokens given by A r 
into or from r. 

Prepare the chart: Do one of the following steps in 
the case of insertion or deletion, respectively: 

Insertion: Renumber edges as follows: First, re- 
place each edge (vj,Vk,r) where j > i and k ^ 
i with an edge (vj +m , v^, r) . Secondly, replace 
each edge (vj,Vk,r) where k > i with an edge 
(vj , Vk+ m , r) . Looping edges at the "modification 
vertex", which have the form (vi,Vi,r), are dealt 
with differently depending on where their sources 
are located, which in turn depends on the predic- 
tion strategy: 

• Bottom-up case: If the looping edge depends 
on an outgoing, non-looping edge ((vi,Vj,r) 
such that j > i), then the looping edge is re- 
placed with an edge («;+ m , fi+m, r) (in effect, 
it is moved). 

• Top-down case: If the looping edge de- 
pends on an incoming, possibly looping edge 
((vj, Vk, r) such that k < i), then do nothing. 

Finally, update the dependency relation V so that 
any edge (vj,Vk,r) such that j < i and k > i is 
made dependent on ti. 

Deletion: Renumber edges as follows: First, re- 
place each edge (vj ,Vk,r) where j > i with an edge 
r). Then replace each edge (vj,v k ,r) 
where k > i with an edge (vj, v^-m, r). 

Reparse: Do the following steps: 

In the case of insertion: create a scanning task for 
each new token; create a combination task for each 
active-inactive edge pair meeting at and i> 8 +m • 

In the case of deletion: create a combination task 
for each active-inactive edge pair meeting at i> 8 - 

Reparse while visiting the disturbed edges in the 
order given by the dependency graph and treating 
the disturbed edges as "sleeping" (that is, they 
do not play any role in the parsing process as 
such). Whenever a new edge is proposed, check 
if an equivalent edge exists in the disturbance set 
according to definition 6. If so, install the new 
edge, update V by letting the new edge inherit 
the dependencies from the old edge. Do not add 



any agenda items for the new edge (thereby dis- 
continuing reparsing along this path). Mark the 
new edge as re-created (with respect to a repars- 
ing-equivalent one). 

Remove edges: Remove each edge that is in the dis- 
turbance set but not in the dependency set of any 
re-created edge. 

5.3 Incremental Complexity 

For the purpose of analysing the incremental complex- 
ity of algorithm 2, we assume that adding or removing 
an edge takes unit time. We also assume that no edge 
has more than a constant number of sources or depen- 
dants and, hence, that the time required to install or 
examine the dependencies of A; edges is O(k). 10 

We first focus on the reparsing step. 11 Consider the 
case of a deletion within a text. The set of new edges N 
are generated as a result of joining two sub charts, which 
we assume have length i and j and contain 0(i 2 ) and 
0(j 2 ) edges, respectively (disregarding the grammar 
constant \G\). The joined chart thus has length i + j 
and consists of 0((i + j) 2 ) edges. The number of new 
edges resulting from joining the subcharts is then \N\ = 
0((i + j) 2 )-(0(i 2 ) + 0(j 2 )) = O(i-j) edges. Since the 
algorithm generates these edges by invoking a 0(n 3 ) 
reparsing algorithm, the new edges require 0((i+j) 3 ) — 
(0(i 3 ) + 0(i 3 )) = 0(i-j-(i + j)) = 0(i 2 -j 2 ) = 0(\N 2 \) 
time. The insertion case can be obtained in a similar 
way and gives the same result. In the remove step, 
the missing edges are found by following dependency 
chains originating from tokens until a reparsing-equi- 
valent edge is found or the chain ends. This step can 
therefore be executed in 0(|M|) time. The algorithm 
as a whole then requires 0(8 2 ) time. 

6 Conclusions 

The boundedness criterion used here provides a guar- 
antee that the next update state is never more than 
an amount of computation away from the current state 
that is limited by the size of the change. This crite- 
rion is very strong. It can be thought of as consti- 
tuting one extreme point of a continuum of ways in 
which to measure the complexity of incremental algo- 
rithms. At the other extreme, we have the option of us- 
ing | P' | + 1 5" | , the cost of discarding the old solution and 

10 This assumption is considered too strong in reason mainte- 
nance, where, in the worst case, all formulas can be directly con- 
nected (see Goodwin [4, page 110 f.]). However, it seems appro- 
priate here, since under a context-free grammar of the kind used 
here only predicted edges may have multiple sources. Moreover, 
the number of these sources is limited because of the linearity of 
the problem instance (the text). 

11 Since we take addition and removal of edges to be the pri- 
mary tasks of incremental update, we disregard the chart-prepa- 
ration step. Although a more thorough analysis might take this 
step into account both in the definition of 8 and in the complex- 
ity analysis, we do not believe that anything fundamental would 
be altered by this. 



invoking a batch-mode algorithm on the modified prob- 
lem instance. This measure might be used for showing 
that an algorithm with poor worst-case incremental be- 
haviour is still practical: Poor incremental behaviour 
means that the algorithm does not respond quickly to 
(some) small changes. However, it may still perform 
better than discarding the old solution and invoking a 
batch-mode algorithm. In other words, even if the al- 
gorithm is unbounded in 8, it may have a lower time 
bound in |P'| + \S'\ than the batch-mode algorithm. 
The unbounded algorithm described in section 4 is an 
example of this: it is clearly more efficient than the 
batch-mode algorithm for the purpose of incremental 
update. 

Several interesting topics for further research present 
themselves: One is to generalize the notions of minimal 
change and bounded incrementality to other process- 
ing frameworks that make use of a table or a chart, for 
example, pseudo-parallel LR parsing (Tomita [13]) or 
tabular generation (Neumann [8]). Another interesting 
topic is to translate the same notions to a unification- 
based grammar formalism. Defining minimal change 
then requires a definition of the difference between two 
feature structures. An immediate observation is that 
this is itself hardly a feature structure, but rather the 
set of (sub)structures that are not present in both fea- 
ture structures (in analogy with our definition of the 
difference between two charts). 
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